ABSTRACT

Engagement during reading can be measured by the amount of time 
readers invest in the reading process. It is hypothesized that 
disengagement is marked by a decrease in time investment as 
compared with the demands made on the reader by the text. In this 
study, self-paced reading times for screens of text were predicted 
by a text complexity score called formality; formality scores 
increase with cohesion, informational content/genre, syntactic 
complexity, and word abstractness as measured by the Coh-Metrix 
text-analysis program. Cognitive decoupling is defined as the 
difference between actual reading times and reading times 
predicted by text formality. Decoupling patterns were found to 
differ as a function of the serial position of the screens of text and 
the text genre (i.e., informational, persuasive, and narrative) but 
surprisingly not as a function of reader characteristics (reading 
speed and comprehension). This underscores the importance of 
mining text characteristics in addition to individual differences and 
task constraints in understanding engagement during reading. 
1. INTRODUCTION

Engagement during reading is essential for comprehension and 
learning [1]. Methods for gauging engagement include measuring 
time invested in the reading process and eye tracking [2-5]. We 
hypothesize that when mind wandering or other forms of 
disengagement occur, there is a marked decrease in time allocation; 
text characteristics then have little impact on reading times. The 
disjoint relationship between textual demands and time investment 
is termed decoupling. Cognitive decoupling is defined as the 
difference between actual reading times and reading times 
predicted by text characteristics. 


This study investigates how engagement changes as a reader 
progresses through screens of text in moderately lengthy 
documents. Changes are expected to be moderated by 
characteristics of reader and text. Relevant reader characteristics 
included overall reading speed and comprehension; text 
characteristics included text difficulty and genre. 

1.1 Text Difficulty 

Text difficulty can be scaled in a variety of ways, validated by
predicting grade levels of text and performance on psychometric
tests of comprehension [6]. The Flesch-Kincaid Grade Level
formula is a readability assessment based on word length and
sentence length [7]. The Coh-Metrix tool analyzes text on multiple
levels of language and discourse using computational linguistics
techniques [8, 9]. Graesser et al. [10] have introduced formality as
a composite measure of text difficulty based on Coh-Metrix
higher-order principal components. Formality has a high correlation
(0.72) with Flesch-Kincaid Grade Level. Discourse formality is calculated
as a mean of five Coh-Metrix principal components having positive
values for increasing levels of difficulty. These include: (1)
referential cohesion; (2) deep (causal) cohesion; (3) informational
content; (4) syntactic complexity; and (5) word abstractness.
Normative values (z-scores) for these five factors and formality are
based on the TASA corpus. These norms are used to compute 
difficulty scores on new texts that researchers wish to analyze. 
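
The averaging step described above can be illustrated with a minimal sketch. It assumes the five component z-scores are already oriented so that higher values mean greater difficulty, as the text states. The component names and the example values are hypothetical and are not Coh-Metrix output.

```python
from statistics import mean

# Hypothetical labels for the five components listed in the text.
COMPONENTS = (
    "referential_cohesion",
    "deep_cohesion",
    "informational_content",
    "syntactic_complexity",
    "word_abstractness",
)


def formality(z_scores):
    """Return the mean of the five component z-scores for one text."""
    missing = [name for name in COMPONENTS if name not in z_scores]
    if missing:
        raise KeyError(f"missing components: {missing}")
    return mean(z_scores[name] for name in COMPONENTS)


# Illustrative z-scores only (assumed values, not taken from the study).
example_text = {
    "referential_cohesion": 0.4,
    "deep_cohesion": 0.1,
    "informational_content": 0.6,
    "syntactic_complexity": 0.2,
    "word_abstractness": -0.3,
}
print(round(formality(example_text), 3))
```

A text whose components average above zero would be more formal than the norming corpus average under this scheme.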

1.2 Genre and Order of Information 

Genre is a discourse feature that is expected to influence 
engagement as well as text difficulty. Narrative texts are considered
the most intrinsically engaging genre for most readers and the least
difficult, compared with informational texts [6, 9, 11, 12].
Persuasive texts lie between narrative and informational texts in
expected difficulty and engagement.

The order of information presented in the text is also expected to 
influence engagement as well as text complexity. Readers begin 
engaged with a text, but may eventually lose interest and disengage 
as the text progresses. Research is needed to document the time 
allocated to texts at different points in the text. Interestingly, basic
research questions have not yet been investigated at a fine-grained
level. Available research has only compared mind wandering across
entire texts that vary in difficulty, and these studies are not
consistent as to whether mind wandering increases or decreases
with text difficulty [13].

Figure 1. Reading Time per Word as a Function of Screen
Serial Position, Segregated by Genre and Reader Type


1.3 Decoupling 

Cognitive decoupling is a discrepancy between textual demands
and the time a participant invests in reading a text. Decoupling
increases as a function of the reader’s disengagement with the text.
Decoupling in this study is measured as the difference between
actual reading times and times predicted by text characteristics. We
interpret positive decoupling scores to indicate that a participant is
investing more time in reading a text than the text characteristics
demand. According to our assumptions, negative values of
decoupling represent a participant investing less time than the text
characteristics demand. The Coh-Metrix formality z-scores, normed
on the TASA corpus, were used to measure text difficulty.
Analogously, the reading time for each text segment was
normalized through z-scores for individual readers on the mean
reading time per word for the text segment under consideration
(compared with the other text segments for that individual).
Decoupling is the normalized reading time for a particular reader
minus the normalized text difficulty based on the TASA corpus.
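
The decoupling computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: reading times are z-scored within one reader across that reader's screens, the population standard deviation is used (the text does not say which form was used), and the formality z-scores are taken as given. The function names and example values are hypothetical.

```python
from statistics import mean, pstdev


def within_reader_z(ms_per_word):
    """Z-score one reader's per-screen reading times against that reader's own screens."""
    center = mean(ms_per_word)
    spread = pstdev(ms_per_word)  # assumption: population SD
    if spread == 0:
        raise ValueError("reading times do not vary; z-scores are undefined")
    return [(value - center) / spread for value in ms_per_word]


def decoupling(ms_per_word, formality_z):
    """Normalized reading time minus normalized text difficulty, per screen."""
    if len(ms_per_word) != len(formality_z):
        raise ValueError("need one formality score per screen")
    return [rt - f for rt, f in zip(within_reader_z(ms_per_word), formality_z)]


# Illustrative values only: four screens read by one participant.
reading_times = [520.0, 480.0, 450.0, 430.0]  # mean ms per word on each screen
formality_scores = [0.18, 0.10, 0.05, 0.12]   # assumed TASA-normed z-scores
scores = decoupling(reading_times, formality_scores)
print([round(s, 2) for s in scores])
```

In this example the first screen yields a positive score (more time than the text demands) and the last a negative one (less time than the text demands), matching the interpretation given above.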

We predict that decoupling scores will become more negative or 
less positive as a reader progresses through a text, corresponding 
with a decrease in engagement. However, previous research [14] 
has not identified the shape of this decreasing function for different 
categories of texts and readers. These effects are predicted to be 
moderated by reader characteristics and genre. 

2. METHODS 

This study had 254 participants in two groups: 128 participated 
online via Mechanical Turk; 126 undergraduate Psychology 
students participated in a lab study. 

Participants were classified according to reading time and
comprehension using the Nelson-Denny assessment with median
split criteria. Participants read one text from each of three genres in
counterbalanced order; texts assigned were randomly sampled from
24 informational, 24 persuasive, and 25 narrative texts. Following
reading, participants wrote a 75-100 word summary of each text,
then rated the familiarity, value, and interest of each text.
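
The median split used to classify participants can be sketched as below. The handling of scores that fall exactly on the median is not specified in the text; this sketch assigns them to the lower group as an assumption. Participant identifiers and scores are hypothetical.

```python
from statistics import median


def median_split(scores):
    """Label each participant 'high' or 'low' relative to the group median.

    Assumption: a score equal to the median is labeled 'low'.
    """
    cut = median(scores.values())
    return {pid: ("high" if value > cut else "low") for pid, value in scores.items()}


# Illustrative comprehension scores for four hypothetical participants.
comprehension = {"p1": 10, "p2": 20, "p3": 30, "p4": 40}
groups = median_split(comprehension)
print(groups)
```

Applying the same function separately to reading speed and to comprehension would give the four reader groups (slow or fast, low or high) used in the analyses.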

Participants used the spacebar to advance through each screen,
providing reading time measurements. Self-paced reading times
were measured as average time per word in milliseconds for each
screen of text. The number of words per screen ranged from 79 to 
131, with a mean of 88.8 and a standard deviation of 11.0. The 
number of screens ranged from 10 to 23 per text. 

3. RESULTS 

3.1 Word Reading Times as Function of Text 
and Reader Characteristics 

Mean reading times per word are presented as a function of serial 
position of screens of text, through position 14. Figure 1 shows 
times for informational (1a), persuasive (1b), and narrative texts
(1c). Participants are segregated into slow versus fast readers and
high versus low comprehenders. 

In Figure 1, reading time functions are similar for readers with
differing comprehension levels and reading speeds. We fit linear
functions to each reader’s times as a function of serial position and
performed an ANOVA on the slopes. As expected, the slopes were
negative, reflecting decreases in reading time over serial position. A
significant effect appeared in the Genre x Reading Time x
Comprehension ANOVA: the slopes were lower for fast than slow
readers, F(1, 748) = 16.54, p < .001. Intercepts were also lower for
fast readers, F(1, 748) = 153.93, p < .001. No other significant effects
or interactions appeared, indicating individual differences had minimal
impact on raw reading time functions. The predicted reading time per
word on a screen, RT’, follows the function RT’ (milliseconds per
word) = 536 - 10*SP, where SP is the serial position of the screen.
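
The per-reader linear fit can be illustrated with an ordinary least-squares sketch. The fitting routine here is a generic textbook implementation and is assumed, not taken from the study. The data are generated from the reported group-level function (536 - 10*SP) purely to show that the fit recovers an intercept and slope; they are not participant data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must vary")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predicted_rt(serial_position):
    """Reported linear prediction, in milliseconds per word."""
    return 536 - 10 * serial_position


positions = list(range(1, 15))                    # screens 1 through 14
times = [predicted_rt(sp) for sp in positions]    # synthetic, noise-free
intercept, slope = fit_line(positions, times)
print(round(intercept, 1), round(slope, 1))
```

Under the reported function, predicted time falls from 526 ms per word on the first screen to 396 ms per word on the fourteenth.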

There did appear to be a dip in early serial positions and then a
leveling off. Therefore we fit a quadratic equation to the reading
time data. When averaging over the reader groups, the resulting
predictive equation was RT’ = 409 - 23*SP + 88*SP². The
improvement of the quadratic equation over the linear function was
small when fitting curves to mean data points, R² = 0.97 versus
0.88, respectively. Moreover, the only coefficient that showed any
differences in the Genre x Reading Time x Comprehension
ANOVA was the intercept, which was lower for faster readers,
F(1, 748) = 79.95, p < .001. In summary, the raw reading times
showed decreases over serial position and a slight quadratic trend,
but did not reveal genre differences or individual differences.

3.2 Formality as a Function of Serial Position
and Genre

It is possible that the above trends in decreasing reading times over 
serial position could be explained by characteristics of the text, as 
opposed to the readers’ strategies (implicit or explicit) in allocation 
of reading time. We conducted an analysis of formality scores as a 
function of serial position, segregating the three text genres. These 
formality scores are plotted in Figure 2 for serial positions 1-14. 
The slopes for each genre were essentially flat as a function of serial 
position, with mean slopes of 0.00, 0.07, and 0.11 for informational,
persuasive, and narrative texts, respectively. Therefore, decreasing 
trends in reading times cannot be attributed to systematic changes 
in text characteristics over serial positions. 

In contrast, formality scores differed by genre, consistent with
previous studies [10]. The mean formality scores were 0.18, 0.09,
and -0.26 for informational, persuasive, and narrative texts,
respectively. These differences were significant, p <
.001, showing the predicted ordering of informational > persuasive
> narrative. Therefore, text characteristics varied over genre but not
serial position. 
Figure 2. Formality as a Function of Screen Position,
Segregated by Genre 

3.3 Decoupling as a Function of Genre, Serial 
Position, and Reader Characteristics 

It is possible that decoupling, rather than raw reading times, 
provides a more sensitive approach to analyzing disengagement. 
Figure 3 shows the decoupling scores for informational (3a), 
persuasive (3b), and narrative texts (3c). The participants are 
segregated into slow versus fast readers and high versus low 
comprehenders. As in the raw reading times, there did appear to be
a dip in early serial positions and then a leveling off with a slow
descent. The only exception was a slight upward trend for the
narrative texts at the very end. When we fit a linear function to all
of the participants for all of the texts, the best fit regression line
yielded R² = .63. A quadratic equation significantly increased the
variance explained, to R² = .88. The best fit function was
Decoupling’ = 0.835 - 0.204*SP + 0.010*SP². When we conducted
a Genre x Reading Time x Comprehension ANOVA, there was
only one significant effect: genre affected all three coefficients of
the quadratic function, F(2, 748) = 36.37, F(2, 748) = 8.46, and
F(2, 748) = 11.00, all p < .001.
There were no significant individual differences (reading speed or
comprehension) and no interactions.
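
The shape of the reported quadratic can be explored with a short sketch. It uses only the three reported coefficients and standard algebra for a parabola; the variable names are hypothetical, and the derived quantities describe the fitted curve, not additional findings of the study.

```python
import math

# Reported coefficients of Decoupling' = B0 + B1*SP + B2*SP^2.
B0, B1, B2 = 0.835, -0.204, 0.010


def predicted_decoupling(serial_position):
    """Evaluate the reported quadratic at a given screen serial position."""
    return B0 + B1 * serial_position + B2 * serial_position ** 2


# Serial position at which the fitted curve reaches its minimum.
vertex = -B1 / (2 * B2)

# Smaller root of the quadratic: where the curve first crosses zero.
discriminant = B1 ** 2 - 4 * B2 * B0
first_zero = (-B1 - math.sqrt(discriminant)) / (2 * B2)

print(round(predicted_decoupling(1), 3))
print(round(vertex, 1), round(predicted_decoupling(vertex), 3))
print(round(first_zero, 2))
```

Read this way, the fitted curve starts positive on the first screen, crosses zero between the fifth and sixth screens, and bottoms out near the tenth screen, which is in line with the early dip and later leveling off described above.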

4. DISCUSSION 

This study has revealed how reading times and cognitive 
decoupling are significantly influenced by text characteristics, 
namely genre and the serial position of information in the text. The 
pattern of results showed higher engagement (reflected in 
decoupling scores) in the first few screens of text and a subsequent 
decrease over the serial position of the screens. Engagement is
deepest in the first 200-400 words, then drops noticeably and
declines slowly thereafter (aside from an interesting upsweep
for narrative texts). The quadratic function captures this trend and
shows a better fit than a linear trend. It is of course strategically 
wise to pay attention to the early text segments because that is a 
critical point when the situation model is set up [11, 14], and the 
reader can make judgments whether the text is interesting or 
important to continue reading [1]. It is important to acknowledge 
that text difficulty is not comparatively high in early text segments, 
as shown in Figure 2, so increased time allocation at the beginning 
of a text cannot be attributed to text difficulty. 

Regarding decoupling scores, text formality and difficulty show the
following trend, compatible with previous research using
Coh-Metrix [2, 10]: informational > persuasive > narrative. However,
cognitive decoupling showed the opposite ordering, such that
readers tended to over-allocate reading time to narrative text and
under-allocate it to the difficult informational text. In essence, there
was a tendency to have lower engagement when the text was more
difficult. Text difficulty has also been found to predict
mind wandering during text comprehension [13, 15] and listening
to lectures [16], but the jury is still out as to (a) whether mind
wandering is more prevalent in discourse that is very easy or very
difficult and (b) what level of discourse analysis is most diagnostic
of mind wandering. Future research should examine how decoupling,
computed as the deviation between reading time and formality,
relates to mind wandering.